import torch

x = torch.tensor(2.0, requires_grad=True)
y = x * x + 3 * x  # y = x^2 + 3x
z = y.mean()       # y is a scalar, so the mean equals y
z.backward()       # reverse pass: fills in x.grad
print(x.grad)      # 7.0 = d(x^2 + 3x)/dx at x=2

tensor(7.)
Ismail TG
April 5, 2025
As I continue to deepen my understanding of machine learning systems, I’ve realized that knowing how models run is just as important as knowing how to build them. This post kicks off a series where I explore PyTorch internals, starting with two powerful components: Autograd and torch.fx.
PyTorch’s autograd is a dynamic automatic differentiation engine. It records operations on tensors to build a computation graph during the forward pass, and then traverses that graph in reverse to compute gradients during the backward pass.
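To make the recorded graph a little more concrete, here is a minimal sketch that inspects it after a forward pass. The tensor values and the names root_name and parent_names are my own choices for illustration; grad_fn and next_functions are the attributes autograd exposes on tensors and backward nodes. The node class names carry a numeric suffix in the PyTorch releases I have used, so the sketch avoids depending on the exact string.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
y = x * x + 3 * x  # forward pass: autograd records each operation

# A non-leaf tensor points at the backward node that produced it.
root = y.grad_fn
root_name = type(root).__name__

# next_functions links a backward node to the backward nodes of its inputs.
parent_names = [type(fn).__name__ for fn, _ in root.next_functions]

print(root_name)
print(parent_names)
print(x.is_leaf, x.grad_fn)  # a user-created leaf tensor has no grad_fn

# Walking the recorded graph in reverse accumulates dy/dx into x.grad.
y.backward()
grad_value = x.grad.item()
print(grad_value)
```

The last operation in the forward pass (the addition) becomes the root of the backward graph, and its inputs (the two multiplications) are one step further in.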
When you perform operations on torch.Tensor objects with requires_grad=True, PyTorch:
- Records each operation as a Function object (e.g., AddBackward0, MulBackward0).
- Performs reverse-mode automatic differentiation when .backward() is called.

Key components:
- Tensor.grad_fn: Points to the function that created the tensor.
- Tensor.grad: Stores the computed gradient.
- torch.autograd.Function: Base class for custom differentiable operations.

torch.fx allows you to capture and transform PyTorch programs as Python-level graphs. This is useful for:
- Programmatic model transformations
- Debugging and visualization
- Building custom compiler backends
The core components of torch.fx are:
- GraphModule: A traced model with a modifiable structure.
- Tracer: Walks through the model and builds a Graph.
- Graph: Contains Node objects that represent operations.
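How these three pieces fit together can be sketched by calling them directly instead of going through a convenience wrapper. The Scale module below is a hypothetical toy model I made up for illustration; fx.Tracer, its trace method, and the fx.GraphModule(root, graph) constructor are part of torch.fx as far as I know.

```python
import torch
import torch.nn as nn
import torch.fx as fx


class Scale(nn.Module):  # hypothetical toy module, for illustration only
    def forward(self, x):
        return x * 2 + 3


model = Scale()

# Tracer runs forward() on proxy values and returns the resulting Graph.
graph = fx.Tracer().trace(model)

# Each Node carries an opcode (placeholder, call_function, output, ...).
ops = [node.op for node in graph.nodes]
print(ops)

# GraphModule pairs the Graph with the original module and
# generates a Python forward() from it.
gm = fx.GraphModule(model, graph)
print(gm.code)

out = gm(torch.tensor(1.0))
print(out)
```

The GraphModule behaves like any other nn.Module, so it can be called, saved, or traced again.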
import torch
import torch.nn as nn
import torch.fx as fx


class MyModel(nn.Module):
    def forward(self, x):
        return x * 2 + 3


model = MyModel()
traced = fx.symbolic_trace(model)
print(traced.graph)

graph():
    %x : [num_users=1] = placeholder[target=x]
    %mul : [num_users=1] = call_function[target=operator.mul](args = (%x, 2), kwargs = {})
    %add : [num_users=1] = call_function[target=operator.add](args = (%mul, 3), kwargs = {})
    return add
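Since the graph is just a list of nodes, a programmatic transformation can be as small as editing a node's target and regenerating the code. The sketch below is one way to do that, not a recommended pass: swap_add_for_sub is a hypothetical helper I invented for illustration, and it assumes the traced graph uses operator.add as its target, as in the printout above.

```python
import operator

import torch
import torch.nn as nn
import torch.fx as fx


class MyModel(nn.Module):
    def forward(self, x):
        return x * 2 + 3


def swap_add_for_sub(gm: fx.GraphModule) -> fx.GraphModule:
    """Hypothetical pass: rewrite every operator.add call into operator.sub."""
    for node in gm.graph.nodes:
        if node.op == "call_function" and node.target is operator.add:
            node.target = operator.sub
    gm.graph.lint()  # sanity-check that the edited graph is still well-formed
    gm.recompile()   # regenerate forward() from the edited graph
    return gm


traced = fx.symbolic_trace(MyModel())
before = traced(torch.tensor(1.0)).item()  # computes 1 * 2 + 3

transformed = swap_add_for_sub(traced)
after = transformed(torch.tensor(1.0)).item()  # computes 1 * 2 - 3

print(before, after)
print(transformed.code)
```

The pass mutates the GraphModule in place, so the value for the original model has to be captured before the rewrite. A real pass would more likely work on a copy.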
Both Autograd and torch.fx are essential for understanding what happens under the hood in PyTorch. Whether you’re debugging models, optimizing inference, or building custom backends, mastering these tools opens the door to deeper systems-level work in AI.
In future posts, I plan to explore:
- Implementing custom autograd functions
- Writing your own FX passes for transformations
- Diving into TorchDynamo and TorchInductor